Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence
Authors
Abstract
In this paper, a new theory is developed for first-order stochastic convex optimization, showing that the global convergence rate is sufficiently quantified by a local growth rate of the objective function in a neighborhood of the optimal solutions. In particular, if the objective function F(w) in the ε-sublevel set grows as fast as ‖w − w∗‖_2^{1/θ}, where w∗ represents the closest optimal solution to w and θ ∈ (0, 1] quantifies the local growth rate, the iteration complexity of first-order stochastic optimization for achieving an ε-optimal solution can be Õ(1/ε^{2(1−θ)}), which is optimal at most up to a logarithmic factor. To achieve the faster global convergence, we develop two different accelerated stochastic subgradient methods by iteratively solving the original problem approximately in a local region around a historical solution, with the size of the local region gradually decreasing as the solution approaches the optimal set. Besides the theoretical improvements, this work also includes new contributions towards making the proposed algorithms practical: (i) we present practical variants of the accelerated stochastic subgradient methods that can run without knowledge of the multiplicative growth constant and even of the growth rate θ; (ii) we consider a broad family of problems in machine learning to demonstrate that the proposed algorithms enjoy faster convergence than the traditional stochastic subgradient method. For example, when applied to ℓ1-regularized empirical polyhedral loss minimization (e.g., hinge loss, absolute loss), the proposed stochastic methods have a logarithmic iteration complexity.

Department of Computer Science, The University of Iowa, Iowa City, IA 52242, USA; Department of Management Sciences, The University of Iowa, Iowa City, IA 52242, USA. Correspondence to: Tianbao Yang. Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017. Copyright 2017 by the author(s).
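The accelerated scheme described in the abstract (solve the problem approximately within a local region around the previous solution, then shrink the region and the target error) can be summarized with a short sketch. The code below is a minimal illustration, not the paper's exact algorithm: the stochastic subgradient oracle `subgrad(w)`, the subgradient-norm bound `G`, and the constant step-size rule `eta = eps / (3 G²)` within each stage are assumptions of this sketch.

```python
import numpy as np

def project_ball(w, center, radius):
    """Project w onto the Euclidean ball {v : ||v - center||_2 <= radius}."""
    diff = w - center
    norm = np.linalg.norm(diff)
    return w if norm <= radius else center + radius * diff / norm

def assg_sketch(subgrad, w0, G, eps0, D1, num_stages, iters_per_stage):
    """Illustrative multi-stage (restarted) stochastic subgradient sketch.

    subgrad(w) -- stochastic subgradient oracle for F at w (assumed given)
    G          -- assumed bound on the stochastic subgradient norms
    eps0       -- assumed upper bound on F(w0) - F_*
    D1         -- initial radius of the local region around w0
    The radius and the target error are halved after every stage, mirroring
    the shrinking local region described in the abstract.
    """
    w_stage = np.asarray(w0, dtype=float)
    D, eps = float(D1), float(eps0)
    for _ in range(num_stages):
        center = w_stage.copy()
        eta = eps / (3.0 * G ** 2)        # constant step size within a stage (assumption)
        w = center.copy()
        w_avg = np.zeros_like(w)
        for _ in range(iters_per_stage):
            g = subgrad(w)                                 # stochastic subgradient
            w = project_ball(w - eta * g, center, D)       # stay inside the local region
            w_avg += w / iters_per_stage
        w_stage = w_avg                   # next stage starts from the averaged iterate
        D, eps = D / 2.0, eps / 2.0       # shrink the local region and the target error
    return w_stage
```

For the ℓ1-regularized hinge-loss example mentioned above, `subgrad` would return the hinge-loss subgradient on a randomly drawn training example plus the subgradient of the ℓ1 term; this instantiation is likewise only illustrative.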
Similar resources
Supplement for “Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence”
Theorem 1. Suppose Assumption 1 holds and F(w) obeys the LGC (6). Given δ ∈ (0, 1), let δ̃ = δ/K, K = ⌈log₂(ε₀/ε)⌉, D₁ ≥ cε₀/ε^{1−θ}, and let t be the smallest integer such that t ≥ max{9, 1728 log(1/δ̃)} G²D₁²/ε₀². Then ASSG-c guarantees that, with probability 1 − δ, F(w_K) − F∗ ≤ 2ε. As a result, the iteration complexity of ASSG-c for achieving a 2ε-optimal solution with a high probability 1 − δ is O(c²G²⌈lo...
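To see how these parameter choices recover the Õ(1/ε^{2(1−θ)}) rate claimed in the abstract, one can tally the stochastic subgradient iterations over the K stages. The computation below drops constants and uses the reconstructed forms of D₁ and t above, so it is a sanity check rather than a quotation from the paper.

```latex
\[
t \;\gtrsim\; \log\!\Big(\tfrac{1}{\tilde\delta}\Big)\,\frac{G^{2}D_{1}^{2}}{\epsilon_{0}^{2}}
  \;=\; \log\!\Big(\tfrac{1}{\tilde\delta}\Big)\,\frac{c^{2}G^{2}}{\epsilon^{2(1-\theta)}}
  \quad\text{for } D_{1}=\frac{c\,\epsilon_{0}}{\epsilon^{1-\theta}},
\qquad
K\,t \;=\; \Big\lceil \log_{2}\tfrac{\epsilon_{0}}{\epsilon}\Big\rceil\, t
  \;=\; \widetilde{O}\!\Big(\frac{c^{2}G^{2}}{\epsilon^{2(1-\theta)}}\Big).
\]
```

For θ = 1 the per-stage count no longer depends on ε, and only the logarithmic number of stages remains, which is consistent with the logarithmic iteration complexity claimed in the abstract for ℓ1-regularized polyhedral losses such as the hinge and absolute losses.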
Stochastic Variance Reduction for Nonconvex Optimization
We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (Svrg) methods for them. Svrg and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (Sgd); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary...
On new faster fixed point iterative schemes for contraction operators and comparison of their rate of convergence in convex metric spaces
In this paper, we present new iterative algorithms in convex metric spaces. We show that these iterative schemes converge to the fixed point of a single-valued contraction operator and then compare their rates of convergence. Additionally, numerical examples for these iteration processes are given.
Stochastic Variance Reduction Gradient for a Non-convex Problem Using Graduated Optimization
In machine learning, nonconvex optimization problems with multiple local optima are often encountered. The Graduated Optimization Algorithm (GOA) is a popular heuristic for finding global optima of nonconvex problems by progressively minimizing a series of increasingly accurate convex approximations to the nonconvex problem. Recently, such an algorithm based on GOA, GradOpt, is propos...
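As a toy illustration of the graduated-optimization idea this excerpt refers to (not of the GradOpt algorithm itself, whose details are not given here), one can minimize Gaussian-smoothed surrogates of a nonconvex function with a decreasing smoothing level, warm-starting each level at the previous minimizer. The smoothing schedule, step size, and score-function gradient estimator below are illustrative assumptions.

```python
import numpy as np

def graduated_minimize(f, x0, sigmas, samples=2000, steps=300, lr=0.02, seed=0):
    """Toy graduated optimization in one dimension: minimize Gaussian-smoothed
    surrogates E[f(x + sigma*z)], z ~ N(0,1), with sigma decreasing, warm-starting
    each level at the previous solution. Illustrative only."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    for sigma in sigmas:                      # coarse-to-fine smoothing levels
        for _ in range(steps):
            z = rng.normal(0.0, 1.0, samples)
            # Score-function (Stein) estimate of d/dx E[f(x + sigma*z)],
            # with f(x) subtracted as a variance-reducing baseline.
            grad = np.mean((f(x + sigma * z) - f(x)) * z) / sigma
            x -= lr * grad
    return x

# A wiggly objective whose heavily smoothed versions are nearly quadratic.
f = lambda x: x ** 2 + 2.0 * np.sin(5.0 * x)
print(graduated_minimize(f, x0=3.0, sigmas=[2.0, 1.0, 0.5, 0.2, 0.05]))
```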
Constrained Nonlinear Optimal Control via a Hybrid BA-SD
The non-convex behavior exhibited by nonlinear systems limits the application of classical optimization techniques to optimal control problems for such systems. This paper proposes a hybrid algorithm, BA-SD, that combines the Bee algorithm (BA) with the steepest descent (SD) method for numerically solving nonlinear optimal control (NOC) problems. The proposed algorithm includes th...
Publication date: 2017